Aggregation Algorithms for Very Large Compressed Data Warehouses

Authors

  • Jianzhong Li
  • Doron Rotem
  • Jaideep Srivastava
Abstract

Many efficient algorithms to compute multidimensional aggregation and Cube for relational OLAP have been developed. However, to our knowledge, there is nothing to date in the literature on aggregation algorithms on compressed data warehouses for multidimensional OLAP. This paper presents a set of aggregation algorithms on very large compressed data warehouses for multidimensional OLAP. These algorithms operate directly on compressed datasets without the need to first decompress them. They are applicable to data warehouses that are compressed using a variety of data compression methods. The algorithms have different performance behavior as a function of dataset parameters, sizes of outputs and main memory availability. The analysis and experimental results show that the algorithms have better performance than the traditional aggregation algorithms.
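As a rough illustration of what aggregating "directly on compressed datasets" can mean, the sketch below sums a two-dimensional array by row while it stays run-length encoded. The abstract does not name the compression methods or the algorithms used in the paper, so run-length encoding, the row-major layout, and the function names `rle_encode` and `sum_by_row` are all assumptions made for this example only.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (cell value, number of consecutive cells with that value)


def rle_encode(cells: List[int]) -> List[Run]:
    """Run-length encode a row-major linearized array (illustrative compression only)."""
    runs: List[Run] = []
    for value in cells:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


def sum_by_row(runs: List[Run], n_rows: int, n_cols: int) -> List[int]:
    """Compute SUM grouped by row by walking the runs, never expanding them."""
    totals = [0] * n_rows
    pos = 0  # logical position in the uncompressed, row-major array
    for value, length in runs:
        remaining = length
        while remaining > 0:
            if pos >= n_rows * n_cols:
                raise ValueError("runs describe more cells than the array holds")
            row = pos // n_cols
            # Number of cells of this run that fall inside the current row.
            take = min(remaining, (row + 1) * n_cols - pos)
            totals[row] += value * take
            pos += take
            remaining -= take
    if pos != n_rows * n_cols:
        raise ValueError("runs describe fewer cells than the array holds")
    return totals
```

For a 3 x 4 array with cells `[0, 0, 5, 5, 5, 0, 0, 0, 3, 3, 3, 3]`, `rle_encode` yields four runs and `sum_by_row(runs, 3, 4)` returns `[10, 5, 12]`. The work is proportional to the number of runs plus the number of row boundaries they cross, not to the number of cells, which is the kind of saving that operating on compressed data is meant to give.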

Similar resources

Efficient Aggregation Algorithms for Compressed Data Warehouses

Aggregation and cube are important operations for online analytical processing (OLAP). Many efficient algorithms to compute aggregation and cube for relational OLAP have been developed. Some work has been done on efficiently computing cube for multidimensional data warehouses that store data sets in multidimensional arrays rather than in tables. However, to our knowledge, there is nothing to d...

An Efficient Method for Mining Frequent Itemsets in Market Basket Data Analysis

Discovery of hidden and valuable knowledge from large data warehouses is an important research area and has attracted the attention of many researchers in recent years. Most of Association Rule Mining (ARM) algorithms start by searching for frequent itemsets by scanning the whole database repeatedly and enumerating the occurrences of each candidate itemset. In data mining problems, the size of ...

pCube: Update-Efficient Online Aggregation with Progressive Feedback and Error Bounds

Multidimensional data cubes are used in large data warehouses as a tool for online aggregation of information. As the number of dimensions increases, supporting efficient queries as well as updates to the data cube becomes difficult. Another problem that arises with increased dimensionality is the sparseness of the data space. In this paper we develop a new data structure referred to as the pCu...

Accessing Data in Block-Compressed Data Warehouses

The large size of most data warehouses (typically hundreds of gigabytes to terabytes), which results in non-trivial storage costs, makes compression techniques attractive for warehousing environments. In particular, block-level compression (as opposed to attribute or record level schemes) has been shown to achieve the greatest reductions in storage size for databases. A key issue is how to quic...

Single Stock Dynamics on High-Frequency Data: From a Compressed Coding Perspective

High-frequency return, trading volume and transaction number are digitally coded via a nonparametric computing algorithm, called hierarchical factor segmentation (HFS), and then are coupled together to reveal a single stock dynamics without global state-space structural assumptions. The base-8 digital coding sequence, which is capable of revealing contrasting aggregation against sparsity of ext...

Publication date: 1999